
    Widening siamese architectures for stereo matching

    Computational stereo is one of the classical problems in computer vision. Numerous algorithms and solutions have been reported in recent years, focusing on methods for computing similarity, aggregating it to obtain spatial support, and finally optimising an energy function to find the final disparity. In this paper, we focus on the feature extraction component of the stereo matching architecture and show that standard CNN operations can be used to improve the quality of the features used to find point correspondences. Furthermore, we use a simple spatial aggregation that hugely simplifies the correlation learning problem, allowing us to better evaluate the quality of the extracted features. Our results on benchmark data are compelling and show promising potential even without refining the solution.
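    The matching step described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes per-pixel feature vectors have already been extracted by a siamese CNN, and picks the disparity with the highest dot-product correlation between left and right feature maps.

```python
import numpy as np

def correlation_disparity(feat_left, feat_right, max_disp):
    """Per pixel, pick the disparity d maximising the dot-product
    correlation between feature maps of shape (H, W, C)."""
    H, W, _ = feat_left.shape
    scores = np.full((H, W, max_disp + 1), -np.inf)
    for d in range(max_disp + 1):
        # right image shifted by d pixels; only valid columns are scored
        scores[:, d:, d] = np.einsum(
            "hwc,hwc->hw", feat_left[:, d:], feat_right[:, : W - d])
    return scores.argmax(axis=2)

# toy check: features shifted by 3 columns should give disparity 3
rng = np.random.default_rng(0)
right = rng.normal(size=(4, 16, 32))
left = np.roll(right, 3, axis=1)   # left pixel x matches right pixel x-3
disp = correlation_disparity(left, right, max_disp=5)
```

    In a learned pipeline the dot product would typically be replaced or followed by a small trainable correlation module; the winner-takes-all argmax here stands in for the final disparity selection.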

    Computer Vision in the Surgical Operating Room

    Background: Multiple types of surgical cameras are used in modern surgical practice and provide a rich visual signal that is used by surgeons to visualize the clinical site and make clinical decisions. This signal can also be used by artificial intelligence (AI) methods to provide support in identifying instruments, structures, or activities, both in real-time during procedures and postoperatively for analytics and understanding of surgical processes. Summary: In this paper, we provide a succinct perspective on the use of AI, and especially computer vision, to power solutions for the surgical operating room (OR). The synergy between data availability and technical advances in computational power and AI methodology has led to rapid developments in the field and promising advances. Key Messages: With the increasing availability of surgical video sources and the convergence of technologies around video storage, processing, and understanding, we believe clinical solutions and products leveraging vision are going to become an important component of modern surgical capabilities. However, both technical and clinical challenges remain to be overcome to efficiently translate vision-based approaches into the clinic.

    RCM-SLAM: Visual localisation and mapping under remote centre of motion constraints

    In robotic surgery, the motion of instruments and the laparoscopic camera is constrained by their insertion ports, i.e. a remote centre of motion (RCM). We propose a Simultaneous Localisation and Mapping (SLAM) approach that estimates laparoscopic camera motion under RCM constraints. To achieve this, we derive a minimal solver for the absolute camera pose given two 2D-3D point correspondences (RCM-PnP) and a bundle adjustment optimiser that refines camera poses within an RCM-constrained parameterisation. These two methods are used together with previous work on relative pose estimation under RCM [1] to assemble a SLAM pipeline suitable for robotic surgery. Our simulations show that RCM-PnP outperforms conventional PnP for a wide noise range in the RCM position. Results with video footage from a robotic prostatectomy show that RCM constraints significantly improve camera pose estimation.
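    The key idea of an RCM-constrained parameterisation can be sketched in a few lines. This is an illustrative simplification, not the paper's RCM-PnP solver: an RCM pose is described by two tilt angles of the insertion axis, a roll about that axis, and the insertion depth (4 DoF instead of the 6 of an unconstrained pose), so the camera centre is forced to lie on a line through the RCM point.

```python
import numpy as np

def rcm_pose(rcm, theta, phi, roll, depth):
    """Build a camera centre and a roll rotation constrained by the RCM."""
    # unit insertion-axis direction from the two tilt angles
    axis = np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])
    centre = rcm + depth * axis          # centre lies on the RCM line
    # rotation about the insertion axis (Rodrigues' formula)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    R = np.eye(3) + np.sin(roll) * K + (1 - np.cos(roll)) * (K @ K)
    return R, centre

rcm = np.array([0.1, -0.2, 0.3])         # fixed trocar (pivot) point
R, c = rcm_pose(rcm, 0.4, 1.1, 0.7, 0.05)
```

    Optimising over these reduced parameters, rather than a free 6-DoF pose, is what lets bundle adjustment enforce the RCM constraint by construction.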

    Catheter segmentation in X-ray fluoroscopy using synthetic data and transfer learning with light U-nets

    Background and objectives: Automated segmentation and tracking of surgical instruments and catheters under X-ray fluoroscopy hold the potential for enhanced image guidance in catheter-based endovascular procedures. This article presents a novel method for real-time segmentation of catheters and guidewires in 2D X-ray images. We employ Convolutional Neural Networks (CNNs) and propose a transfer learning approach, using synthetic fluoroscopic images, to develop a lightweight version of the U-Net architecture. Our strategy, requiring a small amount of manually annotated data, streamlines the training process and results in a U-Net model that achieves performance comparable to the state-of-the-art segmentation, with a decreased number of trainable parameters. Methods: The proposed transfer learning approach exploits high-fidelity synthetic images generated from real fluoroscopic backgrounds. We implement a two-stage process, initial end-to-end training and fine-tuning, to develop two versions of our model, using synthetic and phantom fluoroscopic images independently. A small number of manually annotated in-vivo images is employed to fine-tune the deepest 7 layers of the U-Net architecture, producing a network specialised for pixel-wise catheter/guidewire segmentation. The network takes as input a single grayscale image and outputs the segmentation result as a binary mask against the background. Results: Evaluation is carried out with images from in-vivo fluoroscopic video sequences from six endovascular procedures with different surgical setups. We validate the effectiveness of developing the U-Net models using synthetic data, in tests where fine-tuning and testing in-vivo take place both by dividing data from all procedures into independent fine-tuning/testing subsets and by using different in-vivo sequences. Accurate catheter/guidewire segmentation (average Dice coefficients of ~0.55, ~0.26 and ~0.17) is obtained with both U-Net models. Compared to state-of-the-art CNN models, the proposed U-Net achieves comparable segmentation accuracy (±5% average Dice coefficient) while yielding an 84% reduction in testing time. This adds flexibility for real-time operation and makes our network adaptable to increased input resolution. Conclusions: This work presents a new approach to developing CNN models for pixel-wise segmentation of surgical catheters in X-ray fluoroscopy, exploiting synthetic images and transfer learning. Our methodology reduces the need to manually annotate large volumes of data for training. This represents an important advantage, given that manual pixel-wise annotation is a key bottleneck in developing CNN segmentation models. Combined with a simplified U-Net model, our work yields significant advantages compared to current state-of-the-art solutions.
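    The Dice coefficient used as the evaluation metric above is simple to state in code. This is a generic illustration of the metric for a predicted binary mask against ground truth, not the paper's evaluation script.

```python
import numpy as np

def dice(pred, gt, eps=1e-7):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + eps)

# toy masks: 4-pixel square vs a 6-pixel rectangle, 4 pixels overlapping
a = np.zeros((4, 4), dtype=int); a[1:3, 1:3] = 1
b = np.zeros((4, 4), dtype=int); b[1:3, 1:4] = 1
# dice = 2*4 / (4 + 6) = 0.8
```

    The small `eps` term keeps the metric defined when both masks are empty, a common edge case in frames with no visible catheter.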

    HAPNet: hierarchically aggregated pyramid network for real-time stereo matching

    Recovering the 3D shape of the surgical site is crucial for multiple computer-assisted interventions. Stereo endoscopes can be used to compute 3D depth, but computational stereo is a challenging, non-convex and inherently discontinuous optimisation problem. In this paper, we propose a deep learning architecture that avoids the explicit construction of a similarity cost volume, one of the most computationally costly blocks of stereo algorithms. This makes training our network significantly more efficient and avoids the need for large memory allocation. Our method performs well, especially in regions with multiple discontinuities, such as around surgical instrumentation or complex small structures. The method compares well to state-of-the-art techniques while taking a different methodological angle on the computational stereo problem in surgical video.
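    A back-of-the-envelope calculation shows why skipping the cost volume matters: a dense volume of feature similarities scales with image height and width, the disparity range, and the feature channels. The numbers below are illustrative, not from the paper.

```python
def cost_volume_bytes(h, w, d, f, bytes_per_value=4):
    """Memory of a dense H x W x D x F float32 cost volume."""
    return h * w * d * f * bytes_per_value

# e.g. 512 x 512 endoscopic frames, 128 disparities, 32 feature channels
mb = cost_volume_bytes(512, 512, 128, 32) / 2**20   # mebibytes
```

    At these illustrative settings the volume alone occupies 4 GiB before any activations or gradients, which is why cost-volume-free designs can be attractive for real-time surgical video.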

    Comparing the performance of conventional and robotic catheters in transcatheter aortic valve implantation

    In this paper, we investigate the performance of a recently developed robotic catheterization platform in comparison to conventional surgical equipment. Transcatheter aortic valve implantation (TAVI) was chosen as the test case, and 12 interventionists (6 experts and 6 novices) participated in experiments with a silicone aorta model. Video sequences of the fluoroscopic monitor used for guiding the instruments were captured and processed with specialized software. To evaluate and compare the two systems, the 2D position of the catheter/guidewire tip is tracked and the shape of the phantom model is extracted from the video frames. In our analysis, we focus on three metrics: the procedure time, the average speed, and the average distance to the vessel wall. The obtained results show that procedure time is capable of discriminating between the two experience groups, achieving p = 0.008 in the first stage of the experiment. In addition, experts consistently exhibit a higher average speed than novices. Ultimately, the increased average distance to the vessel wall demonstrated by the robotic system is an indication of improved precision and safer catheter/guidewire navigation.
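    The three metrics named above can be computed directly from a tracked 2D tip trajectory. This sketch is an assumption about the form of that computation (sampled wall points, a known frame rate), not the paper's analysis software.

```python
import numpy as np

def trajectory_metrics(tip_xy, wall_xy, fps):
    """Procedure time, average tip speed, and mean distance to the wall,
    from a (T, 2) tip trajectory and (N, 2) sampled wall points."""
    tip_xy = np.asarray(tip_xy, float)
    wall_xy = np.asarray(wall_xy, float)
    time_s = (len(tip_xy) - 1) / fps
    steps = np.linalg.norm(np.diff(tip_xy, axis=0), axis=1)
    speed = steps.sum() / time_s                 # path length / time
    # distance from each tip sample to its nearest wall sample
    d = np.linalg.norm(tip_xy[:, None] - wall_xy[None], axis=2).min(axis=1)
    return time_s, speed, d.mean()

# toy trajectory: two steps of length 5 at 1 frame per second
tip = [(0, 0), (3, 4), (6, 8)]
wall = [(0, 1), (10, 0)]
t, v, dist = trajectory_metrics(tip, wall, fps=1)
```

    In practice the wall would be a densely sampled contour extracted from the phantom, and positions would be converted from pixels to millimetres before comparison.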

    Arthroscopic simulation using a knee model can be used to train speed and gaze strategies in knee arthroscopy

    Purpose: This study aimed to determine the effect of a simulation course on the gaze fixation strategies of participants performing arthroscopy. Methods: Participants (n = 16) were recruited from two one-day simulation-based knee arthroscopy courses and were asked to undergo a task before and after the course, which involved identifying a series of arthroscopic landmarks. The gaze fixation of the participants was recorded with a wearable eye-tracking system. The time taken to complete the task, and the proportion of time participants spent with their gaze fixated on the arthroscopic stack, on the knee model, and away from the stack or knee model, were recorded. Results: Participants demonstrated a significantly decreased completion time in their second attempt compared to the first (P = 0.001). In their second attempt, they also demonstrated improved gaze fixation strategies, with a significantly increased amount (P = 0.008) and proportion of time (P = 0.003) spent fixated on the screen versus the knee model. Conclusion: Simulation improved the arthroscopic skills of orthopaedic surgeons, specifically by improving their gaze control strategies and decreasing the time taken to identify and mark landmarks in an arthroscopic task.
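    The per-region time and proportion measures described above reduce to simple counting once each eye-tracking frame has been labelled with a gaze region. The region names and frame rate below are illustrative assumptions.

```python
from collections import Counter

def gaze_proportions(labels, fps):
    """Map each gaze region to (seconds fixated, proportion of frames)."""
    counts = Counter(labels)
    total = len(labels)
    return {region: (n / fps, n / total) for region, n in counts.items()}

# toy sequence of 10 labelled frames at 2 frames per second
frames = ["screen"] * 6 + ["knee"] * 3 + ["away"] * 1
props = gaze_proportions(frames, fps=2)
```

    Real eye-tracking output runs at much higher rates and requires mapping raw gaze coordinates onto the screen, model, and elsewhere before this counting step.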

    Gesture Classification in Robotic Surgery using Recurrent Neural Networks with Kinematic Information

    In this work, we introduce the application of Recurrent Neural Networks (RNNs) to surgical kinematic data for the classification of gestures in three fundamental surgical tasks (suturing, needle passing, and knot tying). The developed RNN-based classifier achieves close to 60% average classification accuracy for all three tasks when trained and tested with dVSS kinematic data from the same operator. Our preliminary work indicates that this type of artificial neural network can serve as a building block for gesture classification systems, which can form the basis for further developing automated skill assessment methods in robotic surgery.
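    The classification described above can be sketched with a single-layer vanilla RNN over a kinematic sequence. This is not the paper's model; the weights here are random stand-ins for a trained network, and the dimensions are illustrative.

```python
import numpy as np

def rnn_classify(seq, Wx, Wh, Wo, b, bo):
    """seq: (T, D) kinematic samples -> softmax over gesture classes."""
    h = np.zeros(Wh.shape[0])
    for x in seq:                        # unroll over time steps
        h = np.tanh(Wx @ x + Wh @ h + b)
    logits = Wo @ h + bo                 # classify from the final state
    e = np.exp(logits - logits.max())    # numerically stable softmax
    return e / e.sum()

rng = np.random.default_rng(1)
T, D, H, G = 20, 6, 16, 3                # 6-D kinematics, 3 gesture classes
seq = rng.normal(size=(T, D))
p = rnn_classify(seq,
                 rng.normal(size=(H, D)) * 0.1,
                 rng.normal(size=(H, H)) * 0.1,
                 rng.normal(size=(G, H)),
                 np.zeros(H), np.zeros(G))
```

    Gated variants such as LSTMs are the usual choice for longer kinematic sequences, since a vanilla RNN of this form struggles with long-range dependencies.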